Method for stabilization of proteins using non-natural amino acids

ABSTRACT

The present invention provides a method for producing modified stable polypeptides by introducing at least one non-natural amino acid into the hydrophobic region of the polypeptide. The thermal and chemical stability of such polypeptides is improved compared to that of the corresponding wild-type proteins.
The invention further provides purified leucine zipper and coiled-coil proteins in which the leucine residues have been replaced with 5,5,5-trifluoroleucines, and the modified proteins so produced demonstrate increased thermal and chemical stability compared to their corresponding wild-type natural proteins.

This work was supported by U.S. Army Research Grant DAAG559810518. The government may have rights in this invention.

Throughout this application various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.

FIELD OF THE INVENTION

The present invention relates to improved stabilization of polypeptides by incorporation of non-natural amino acids, such as hyper-hydrophobic amino acids, into the hydrophobic core regions of the polypeptides.

BACKGROUND OF THE INVENTION

Engineering of stable enzymes and robust therapeutic proteins is of central importance to the biotechnology and pharmaceutical industries. The primary internal driving forces for stabilizing proteins involve various interactions such as desolvation, electrostatic interaction, hydrogen bonding, and van der Waals forces, and a proper balance of these interactions is necessary for the appropriate folding of a protein. Although protein engineering provides powerful tools for the enhancement of enzymatic activity and protein stability (J. L. Cleland, C. S. Craik, Protein Engineering: Principles and Practice (Wiley-Liss, New York, N.Y., 1996); D. Mendel, J. A. Ellman, Z. Y. Chang, D. L. Veenstra, P. A. Kollman, Science 256, 1798 1992; A. R. Fersht, L. Serrano, Curr. Opin. Struct. Biol. 3, 75 1993; B. W. Matthews, Adv. Protein Chem. 46, 249 1995), the scope of engineering of proteins is limited by the functionality offered by the twenty naturally occurring proteinogenic amino acids (V. W. Cornish, D. Mendel, P. G. Schultz, Angew. Chem. Int. Ed. Engl. 34, 621, 1995), permitting only modest and unpredictable gains in stability by modifying the protein sequence.

Non-natural amino acids that contain unique side chain functional groups, including halogens, unsaturated hydrocarbons, heterocycles, silicon, and organometallic units, can offer advantages in improving the stability of the folded structure of proteins without requiring sequence modifications. Functionalities orthogonal to those of the naturally occurring amino acids, including alkenes (van Hest, J. C. M. et al., 1998. FEBS Lett., 428, 68-70), alkynes (van Hest, J. C. M.; Kiick, L. K.; Tirrell, D. A. J. Am. Chem. Soc. 2000, 122, 1282-1288), aryl halides (Sharma, N.; Furter, R.; Kast, P.; Tirrell, D. A. FEBS Lett. 2000, 467, 37-40) and electroactive side chains (Kothakota, S.; Fournier, M. J.; Tirrell, D. A.; Mason, T. L. J. Am. Chem. Soc. 1995, 117, 536-537) have been incorporated into proteins prepared in bacterial cultures. Trifluoromethionine has been inserted into bacteriophage lambda lysozyme in vivo and serves as a unique probe for ¹⁹F NMR spectroscopy (Duewel, H.; Daub, E.; Robinson, V.; Honek, J. F. Biochemistry 1997, 36, 3404-3416). Trifluoroleucine was reported more than 30 years ago to support bacterial cell growth and to be incorporated into nascent proteins in the absence of leucine during biosynthesis (Rennert, O. M.; Anker, H. S. Biochemistry 1963, 2, 471). In addition, substitution of amino acids such as serine or alanine that normally comprise the hydrophilic region(s) of the proteins, but are also present, to a lesser degree, in the hydrophobic regions, with the fluoro derivatives is likely to result in stronger inter-helical interaction, thus resulting in improved stability.

Leucine-zipper domains occur commonly in protein assemblies such as eukaryotic transcription factors (O'Shea, E. K, Rutkowski, R., Kim, P. S. Science 1989, 243, 538-542; Lumb, K. J, Kim, P. S. Science 1995, 268, 436-438; Wendt, H., Baici, A., Bosshard, H. R.; J. Am. Chem. Soc. 1994, 116, 6073-6074; Gonzales, L., Brown, R. A., Richardson, D., Alber, T. Nat. Struct. Biol. 1996, 3, 1002-1100; Kenar, K. T., Garcia-Moreno, B., Freire, E. Protein Sci. 1995, 4, 1934-1938; Mohanty, D., Kolinski, A., Skolnick, J. Biophys. J. 1999, 77, 54-69; d'Avignon, D. A., Bretthorst, G. L., Holtzer, M. E., Holtzer, A. Biophys. J. 1999, 76, 2752-2759). Such domains form coiled-coil structures comprising generic heptad repeats designated abcdefg, where the d positions are occupied predominantly by leucine residues. The thermodynamics (Thompson, K. S., Vinson, C. R., Freire, E. Biochemistry 1993, 32, 5491-5496; Krylov, D, Mikhailenko, I., Vinson, C. EMBO J. 1994, 13, 2849-2861), kinetics (Wendt, H., Berger, C., Baici, A., Thomas, R. M., Bosshard, H. R. Biochemistry 1995, 34, 4091-4107; Chao, H., Houston, M. E., Grothe, S., Kay, C. M., O'Connor-McCourt, M., Irvin, R. T., Hodges, R. S. Biochemistry 1996, 35, 12175-12185) and structural features (O'Shea, E. K., Klemm, J. D., Kim, P. S., Alber, T. Science 1991, 254, 539-544; Nautiyal, S., Alber, T., Protein Sci. 1999, 8, 84-90; Harbury, P. B., Zhang, T., Kim, P. S., Alber, T. Science, 1993, 262, 1401-1407) of leucine zipper peptides have been characterized extensively. Studies in which leucine residues at the d positions have been replaced by other naturally occurring aliphatic amino acids have demonstrated that leucine is the most effective amino acid in terms of stabilization of the coiled-coil structure (Moitra, J., Szilak, L., Krylov, D., Vinson, C. Biochemistry 1997, 36, 12567-12573; Hodges, R. S., Zhou, N. E., Kay, C. M., Semchul, P. D. Peptide Research, 1990, 3, 125-137). In fact, leucine is the most abundant amino acid in cellular proteins spanning a wide range of organisms (Creighton, T. E. Proteins Structures and Molecular Properties; W.H. Freeman and Company: New York, 1993). Leucine-enriched hydrophobic cores are important in driving protein folding and determining protein stability in a large number of proteins (Lubienski, M. J., Bycroft, M., Freund, S. M. V., Fersht, A. R. Biochemistry 1994, 33, 8866-8877; Hill, C. P., Osslund, T. D, Eisenberg, D. Proc. Natl. Acad. Sci. USA 1993, 90, 5167-5171).
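
The heptad notation described above can be illustrated with a short sketch. The following Python fragment, offered only as an illustration, labels each residue of a sequence with its heptad position and lists the residues that fall at a chosen position. The function names and the 21-residue sequence are hypothetical; the sequence is a made-up repeat of "IEELKQK" and is not a sequence disclosed in this application.

```python
HEPTAD = "abcdefg"


def heptad_positions(sequence, first_register="a"):
    """Label each residue (1-based number, letter) with its heptad position.

    first_register is the heptad position assumed for the first residue.
    """
    offset = HEPTAD.index(first_register)
    return [
        (i + 1, residue, HEPTAD[(i + offset) % 7])
        for i, residue in enumerate(sequence)
    ]


def residues_at(sequence, position, first_register="a"):
    """Return (residue number, residue letter) pairs found at one heptad position."""
    return [
        (number, residue)
        for number, residue, register in heptad_positions(sequence, first_register)
        if register == position
    ]


# Hypothetical three-heptad peptide, assumed to start at position a.
toy_peptide = "IEELKQK" * 3
d_sites = residues_at(toy_peptide, "d")
print(d_sites)
```

In this toy peptide every d position is a leucine, which is the pattern that the substitution strategy described in this application targets.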

Previous examples of employing other natural amino acids as an attempt to replace leucine have all resulted in loss in coiled coil stability (Moitra, J.; Szilak, L.; Krylov, D.; Vinson, C. Biochemistry 1997, 36, 12567-12573; Hodges, R. S.; Zhou, N. E.; Kay, C. M.; Semchul, P. D.; Peptide Research, 1990, 3, 125-137). This is largely due to the fact that these substitutions are usually the “large” to “small” type and can result in loss of protein hydrophobic core packing efficiency (Sandberg, W.; Terwilliger, T. Science 1989, 245, 54-57; Baldwin, E.; Xu, J.; Hajiseyedjavadi, O.; Baase, W. A.; Matthews, B. W. J. Mol. Biol. 1996, 259, 542-559; Kano, H.; Nishiyama, M.; Tanokura, M.; Doi, J. Protein Eng. 1998, 11, 47-52). Protein cores are believed to be tightly packed and require a jigsaw puzzle-like arrangement of different residue side chains (Harpaz, Y., Gerstein, M.; Chothia, C. Structure 1994, 2, 641-649; Richards, F. M., Lim, W. A. Q. Rev. Biophys. 1994, 15, 507-523; Levitt, M., Gerstein, M., Huang, E., Subbiah, S., Tsai, J. Annu. Rev. Biochem. 1997, 66, 549-579). Thus any perturbation with amino acids of slight difference in geometry can result in substantial energetic cost.

The present invention provides a unique strategy to systematically target the hydrophobic core region(s) of proteins, wherein naturally occurring hydrophobic amino acids are replaced with hyper-hydrophobic non-natural amino acids, resulting in the creation of novel artificial polypeptides, which are identical to the corresponding natural proteins in their tertiary structure and function, but offer an additional advantage of increased stability relative to the corresponding wild type proteins.

SUMMARY OF THE INVENTION

The present invention provides methods to improve the stability of proteins by incorporating one or more non-natural amino acids into the hydrophobic core region(s) of existing protein structures. The thermal and chemical stability of such proteins having the non-natural amino acids is significantly improved compared to those of corresponding wild-type proteins.

The invention further provides purified leucine zipper and coiled-coil polypeptides in which the leucine residues have been replaced with 5,5,5-trifluoroleucines, and the modified proteins so produced. These proteins demonstrate increased thermal and chemical stability compared to their corresponding wild type natural proteins.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 a shows the amino acid sequence of GCN4-pl.

FIG. 1 b shows the structures of leucine, trifluoroleucine (Tfl) and hexafluoroleucine (Hfl).

FIG. 1 c shows a three-dimensional representation of the dimeric GCN4-pl substituted with trifluoroleucine at the four d-positions in the helix.

FIG. 2 a shows CD spectra of Leu-GCN4-pl (□) and Tfl-GCN4-pl (◯) at 0° C. and 30 μM.

FIG. 2 b shows thermal unfolding profiles for Leu-GCN4-pl (squares) and Tfl-GCN4-pl (circles) at 30 μM (open symbols) and 85 μM (closed symbols).

FIG. 2 c shows the concentration dependence of the thermal melting temperature of Leu-GCN4-pl (□) and Tfl-GCN4-pl (◯).

FIG. 2 d shows guanidinium hydrochloride (GuHCl) titration of Leu-GCN4-pl (squares) and Tfl-GCN4-pl (circles). The difference in the ability of these molecules to resist denaturation is examined at 30° C. (open symbols) and 50° C. (closed symbols). The ellipticity is monitored at 222 nm and with a peptide concentration of 30 μM. Insert: The GuHCl titration midpoint concentration, defined as the concentration of GuHCl that is needed to denature 50% of the peptides, is plotted as a function of temperature, as described in Example 5.

FIG. 3 a shows CD spectra for Leu-bZip (squares) and Tfl-bZip (circles) without CREB DNA binding sequence (open symbols) and with DNA (closed symbols) at 0° C., as described in Example 6.

FIG. 3 b shows a mobility shift assay of Leu-bZip and Tfl-bZip binding to oligonucleotides containing the AP-1 binding site (5′-GTGGAGATGACTCATCTCCGG-3′, top), the CREB binding site (5′-TGGAGATGACGTCATCTCCT-3′, middle) and the nonspecific sequence (NON, 5′-GATCCCAACACGTGTTGGGATC-3′, bottom), as described in Example 6.

FIG. 4 a shows the amino acid sequence of a leucine-zipper peptide designated A1. The leucine positions are highlighted in bold.

FIG. 4 b shows western blot analysis of A1 expression by E. coli. Lane 1: uninduced sample; lane 2: induced sample without supplementation; lane 3: induced sample supplemented with leucine; lane 4: induced sample supplemented with 2; lane 5: induced sample supplemented with 3, as described in Example 7.

FIG. 5 shows the effect on the extent of fluorination in A1 of varying the concentration of leucine in the expression medium while holding the concentration of trifluoroleucine constant at 100 mg/L, as described in Example 7. Normal leucine concentration in expression medium is 40 mg/L. The extent of incorporation is determined by amino acid analysis.

FIG. 6 shows CD spectra of A1 () and FA1-92 (◯) at 0° C. (10 μM protein concentration, PBS buffer, pH 7.4) as described in Example 7. Both proteins are highly helical as suggested by the ellipticity at 222 nm. The overlap of the spectra indicates identical secondary structures.

FIG. 7 shows the results of a thermal denaturing experiment on fluorinated A1 proteins with different levels of Tfl incorporation as described in Example 7. Insert: The thermal melting temperature (T_(m)) plotted as a function of level of incorporation. T_(m) is defined as the temperature at which 50% of the peptide has unfolded. The stability of the protein increases with increasing level of 2-trifluoroleucine substitution. (10 μM protein concentration, PBS buffer, pH 7.4).

FIG. 8 shows the results of urea titration of A1 and fluorinated A1 at 0° C. demonstrating that the chemical stability is also improved upon Tfl incorporation as shown in Example 7. The fraction of unfolded protein is plotted against increasing urea concentration. Insert: The urea concentration at which 50% of protein is denatured is plotted against the extent of fluorination. (10 μM protein concentration, PBS buffer, pH 7.4).

DETAILED DESCRIPTION OF THE INVENTION

Definitions

As used in this application, the following words or phrases have the meanings specified.

As used herein, “natural” or “wild type” refers to a protein or a polypeptide, which is found in nature, and “artificial” refers to a protein or a polypeptide that comprises non-natural sequences and/or amino acids.

As used herein, the term “non-natural amino acid” refers to an amino acid that is different from the twenty naturally occurring amino acids (alanine, arginine, glycine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, serine, threonine, histidine, lysine, methionine, proline, valine, isoleucine, leucine, tyrosine, tryptophan, phenylalanine) in its side chain functionality.

As used herein, the term “hyper-hydrophobic” means that the non-natural amino acid is more hydrophobic than the corresponding natural amino acid. Examples of hyper-hydrophobic amino acids include trifluoroleucine, hexafluoroleucine, didehydroleucine, trifluorovaline, and hexafluorovaline.

As used herein, the term “T_(m)” means the temperature at which 50% of the peptide has unfolded.

As used herein, the term “C_(m)” means the detergent concentration at which 50% of the peptide has unfolded.
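
The two midpoint definitions above can be illustrated with a short sketch. The following Python fragment, offered only as an illustration under stated assumptions, estimates the midpoint of an unfolding curve by linear interpolation between the two data points that bracket 50% unfolding. The function name `midpoint` and the data values are hypothetical placeholders and are not measurements from the Examples. The same function could serve for T_(m) (x is temperature) or C_(m) (x is concentration).

```python
def midpoint(x_values, fraction_unfolded, level=0.5):
    """Return the x value at which the curve first crosses `level`.

    Linear interpolation is used between the two neighbouring points
    that bracket the crossing. A ValueError is raised if no crossing exists.
    """
    points = list(zip(x_values, fraction_unfolded))
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        brackets = (f0 - level) * (f1 - level) <= 0
        if brackets and f0 != f1:
            return x0 + (level - f0) * (x1 - x0) / (f1 - f0)
    raise ValueError("curve does not cross the requested level")


# Hypothetical thermal unfolding curve (temperatures in degrees C).
temperatures = [20, 30, 40, 50, 60, 70]
unfolded = [0.02, 0.10, 0.30, 0.70, 0.95, 1.00]
t_m_estimate = midpoint(temperatures, unfolded)
print(round(t_m_estimate, 1))
```

Linear interpolation is the simplest choice; fitting a two-state model to the whole curve would be an alternative when the data are noisy.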

Method of the Invention

The present invention provides methods to improve the stability of proteins by incorporating one or more non-natural amino acids into the hydrophobic core region(s) of existing protein structures. Improved stability refers to the presence of a higher ratio of folded to unfolded protein relative to that of the wild type protein. Improved stability can be determined by examining the amount of folded protein present under varying conditions of temperature, detergent, and pH.
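
The ratio of folded to unfolded protein referred to above can be related to a measured signal with a short sketch. The following Python fragment is an illustration only, and it assumes a simple two-state, monomolecular equilibrium in which a spectroscopic signal (for example, ellipticity at 222 nm) varies linearly between a fully folded and a fully unfolded baseline. The function names and numerical values are hypothetical. For dimeric coiled coils the equilibrium also depends on peptide concentration, which this sketch does not model.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def fraction_folded(signal, folded_baseline, unfolded_baseline):
    """Fraction folded, assuming the signal is linear between the two baselines."""
    return (signal - unfolded_baseline) / (folded_baseline - unfolded_baseline)


def folded_to_unfolded_ratio(fraction):
    """Ratio [folded]/[unfolded] for a fraction folded strictly between 0 and 1."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction folded must lie strictly between 0 and 1")
    return fraction / (1.0 - fraction)


def unfolding_free_energy(fraction, temperature_k):
    """Free energy of unfolding in kJ/mol: RT * ln([folded]/[unfolded])."""
    return GAS_CONSTANT * temperature_k * math.log(folded_to_unfolded_ratio(fraction))


# Hypothetical signal values in arbitrary ellipticity units.
f = fraction_folded(signal=-16.0, folded_baseline=-30.0, unfolded_baseline=-2.0)
print(f, folded_to_unfolded_ratio(f), unfolding_free_energy(f, 298.15))
```

Under these assumptions, a higher ratio at a given temperature or denaturant concentration corresponds to a larger free energy of unfolding, which is the sense in which stability is compared here.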

Protein folding is driven by a variety of interactions including desolvation, H-bonding, electrostatic interaction and van der Waals forces, and there is a general tendency of protein structures to bury the hydrophobic amino acids away from water. By changing the naturally occurring hydrophobic amino acids to hyper-hydrophobic amino acid analogs, it is believed that the proteins' tendency to remain folded will increase. This strategy can be applied to almost any protein because all protein folding is driven by this tendency. Proteins that could be evaluated for increased stability using this approach include proteins containing leucine-zipper domains, membrane proteins, cytokines and enzymes (M. Roux, F. Nezil, M. Monck, M. Bloom, Biochemistry 33, 307 1994). Proteins such as small membrane peptides that rely on hydrophobic side chains to form ion channels in membranes may particularly benefit from increased membrane or inter-peptide association offered by the methods of this invention. Furthermore, this approach will be especially useful for designing enzymes that can function in non-aqueous medium, such as in organic solvents (Gladilin, A. K., Levashov, A. V., Biochemistry (Mosc.), 1998, 63, 345-356; Gupta, M. N., Europ. J. Biochem, 1992, 203, 25-32).

The non-natural amino acids incorporated into polypeptides using the method of this invention are different from the twenty naturally occurring amino acids in their side chain functionality. The non-natural amino acid can be a close analog of one of the twenty natural amino acids, or it can introduce a completely new functionality and chemistry, as long as the hydrophobicity of the non-natural amino acid is either equivalent to or greater than that of the natural amino acid. The non-natural amino acid can either replace an existing amino acid in a protein (substitution), or be an addition to the wild type sequence (insertion). The incorporation of non-natural amino acids can be accomplished by known chemical methods including solid-phase peptide synthesis or native chemical ligation, or by biological methods such as, but not limited to, in vivo incorporation of the non-natural amino acid by expression of the cloned gene in a suitable host.

In a preferred embodiment, the non-natural amino acids used are the fluorinated amino acids including trifluoroleucine and hexafluoroleucine. In one embodiment, by replacing leucine with e.g., trifluoroleucine in a polypeptide of the leucine zipper family, leucine zipper peptides and proteins gain stability with respect to thermal and chemical denaturation. The choice of fluorinated amino acids is based on several factors, the most important of which is the observation that many fluorocarbons behave as though they are more hydrophobic than their hydrocarbon analogs (Gough, C. A.; Pearlman, D. A.; Kollman, P. J. Chem. Phys. 1993, 99, 9103-9110; Hine, J., Mookerjee, P. K. J. Org. Chem. 1975, 40, 292-297). Second, because the trifluoromethyl group is chemically inert and nearly isosteric to the methyl group, its insertion into the helical interface does not disrupt the arrangement of the hydrophobic pocket around what was previously a methyl group (Kukhar, V. P., Soloshonok, V. A. Fluorine Containing Amino Acids—Synthesis and Properties; John Wiley & Sons: Chichester, 1995), suggesting that proteins and peptides outfitted with fluorinated amino acids might adopt folded structures similar to those of their corresponding “wild type” proteins.

In a specific example, using two α-helical polypeptides, GCN4-pl and A1, as model peptides, protein stability was significantly increased by incorporating the hyper-hydrophobic non-natural amino acids. The proteins used in this example are of the leucine-zipper family, in which stability and folding are highly dependent on the core leucine residues. From the X-ray crystal structure analysis of the GCN4-pl leucine zipper peptide, it is known that the branched side chains on opposing leucine residues in a leucine zipper are in a side by side configuration (O. M. Rennert, H. S. Anker, Biochemistry 2, 471, 1963; S. Kothakota, M. J. Dougherty, M. J. Fournier, T. L. Mason, D. A. Tirrell, Macromol Symp. 98, 573, 1995). The methyl groups of leucines provide stabilizing effects by efficiently packing with the many adjacent hydrophobic regions between the α-helices. Because trifluoromethyl groups are isosteric to methyl groups, their insertions into the helical interface do not disrupt the arrangement of the hydrophobic pocket around what was previously a methyl group.

Proteins and Polypeptides of the Invention

The proteins and polypeptides to be stabilized may be isolated from any source whether natural, synthetic, semi-synthetic, or recombinant. As defined elsewhere in this application, “natural” or “wild type” refers to a protein or a polypeptide, which is found in nature, and “artificial” refers to a protein or a polypeptide that comprises non-natural sequences and/or amino acids.

The suitable non-natural amino acids for use in this invention include, but are not limited to, molecules having fluorinated, electroactive, and unsaturated side chain functionalities. Non-natural amino acid analogs and derivatives for leucine include but are not limited to 5,5,5-trifluoroleucine, 5,5,5,5′,5′,5′-hexafluoroleucine, and 2-amino-4-methyl-4-pentenoic acid. Non-natural amino acid analogs and derivatives for isoleucine include but are not limited to 2-amino-3,3,3-trifluoro-methylpentanoic acid, 2-amino-3-methyl-5,5,5-trifluoropentanoic acid, and 2-amino-3-methyl-4-pentenoic acid. Non-natural amino acid analogs and derivatives for valine include but are not limited to trifluorovaline and hexafluorovaline. Non-natural amino acid analogs for methionine include but are not limited to 6,6,6-trifluoromethionine, homoallylglycine, and homopropargylglycine. However, a similar strategy for incorporating halogen-containing non-natural amino acid analogs for phenylalanine, such as p-fluoro-phenylalanine and p-bromophenylalanine, may be less useful for protein stabilization since the presence of electron withdrawing groups such as fluorine may alter the conjugated phenyl ring of phenylalanine.

The proteins and polypeptides that can be targeted for stabilization using the method of the invention are those possessing a hydrophobic core region. These include, but are not limited to, cytokines such as interleukins, Tumor Necrosis Factor, Granulocyte Colony Stimulating Factor, and Erythropoietin; proteases such as Subtilisin and Thermolysin; and industrial enzymes such as dehydrogenases and esterases. Since most of these proteins are comprised of helix bundles, their structure can be stabilized by incorporating non-natural hyper-hydrophobic amino acids into the hydrophobic core region(s) of the target protein.

The protein site(s) targeted for incorporating non-natural amino acids include region(s) containing hydrophobic amino acids that generally drive protein folding. The specific hydrophobic amino acids that are targets of this invention include leucine, isoleucine, valine, and to a lesser degree methionine and phenylalanine.

The proteins of the present invention can be made either by chemical synthesis or by utilizing recombinant DNA technology as described in Examples 2 and 3. The principles of solid phase chemical synthesis of polypeptides are well known in the art and may be found in general texts relating to this area (Dugas, H. and Penney, C. 1981 Bioorganic Chemistry, pp 54-92, Springer-Verlag, New York). Wild type and artificial proteins and polypeptides can be synthesized by solid-phase methodology utilizing an Applied Biosystems 430A peptide synthesizer (Applied Biosystems, Foster City, Calif.) and synthesis cycles supplied by Applied Biosystems. Protected amino acids, such as t-butoxycarbonyl-protected amino acids, and other reagents are commercially available from many chemical supply houses.

Recombinant Nucleic Acid Molecules Comprising Nucleotide Sequences Encoding a Polypeptide of Interest

Also provided are recombinant nucleic acid molecules, such as recombinant DNA molecules (rDNAs) that contain nucleotide sequences encoding a polypeptide of the invention such as a leucine zipper protein, or a coiled-coil protein, or fragments thereof that incorporate at least one non-naturally occurring amino acid. As used herein, a rDNA molecule is a DNA molecule that has been subjected to molecular manipulation in vitro. Methods for generating rDNA molecules are well known in the art, for example, see Sambrook et al., Molecular Cloning (1989), supra.

Vectors

The term vector includes, but is not limited to, plasmids, cosmids, and phagemids. A preferred vector for expression will be an autonomously replicating vector comprising a replicon that directs the replication of the rDNA within the appropriate host cell. The preferred vectors also include an expression control element, such as a promoter sequence, which enables transcription of the inserted sequences and can be used for regulating the expression (e.g., transcription and/or translation) of an operably linked sequence in an appropriate host cell such as Escherichia coli. Expression control elements are known in the art and include, but are not limited to, inducible promoters, constitutive promoters, secretion signals, enhancers, transcription terminators, and other transcriptional regulatory elements. Other expression control elements that are involved in translation are known in the art, and include the Shine-Dalgarno sequence, and initiation and termination codons. The preferred vector also includes at least one selectable marker gene that encodes a gene product that confers drug resistance, such as resistance to ampicillin or tetracycline. The vector also comprises multiple endonuclease restriction sites that enable convenient insertion of exogenous DNA sequences.

The preferred vectors for generating the encoded “wild type” or “artificial” polypeptides are expression vectors, which are compatible with prokaryotic host cells. Prokaryotic cell expression vectors are well known in the art and are available from several commercial sources. For example, a pQE vector (e.g., pQE15, available from Qiagen Corp.) may be used to express “wild type” polypeptides, containing natural amino acids, and “artificial” polypeptides, including those containing non-natural amino acids, in bacterial host cells.

Fusion Genes

A fusion gene includes a sequence encoding a polypeptide of the invention operatively fused (e.g., linked) to a non-related sequence such as, for example, a tag sequence to facilitate isolation and/or purification of the expressed gene product (Kroll, D. J., et al., 1993 DNA Cell Biol 12:441-53). The pQE expression vectors used in this invention express proteins fused to a poly-Histidine tag that facilitates isolation and/or purification of the expressed gene product.

Transformed Host Cells

The invention further discloses a host-vector system comprising the vector, plasmid, phagemid, or cosmid having a nucleotide sequence encoding the polypeptide of the invention, introduced into a suitable host cell. The host-vector system can be used to produce the polypeptides encoded by the inserted nucleotide sequences. The host cell can be either prokaryotic or eukaryotic. Examples of suitable prokaryotic host cells include bacterial strains from genera such as Escherichia, Bacillus, Pseudomonas, Streptococcus, and Streptomyces. Examples of suitable eukaryotic host cells include a yeast cell, a plant cell, or an animal cell, such as a mammalian cell. A preferred embodiment provides a host-vector system comprising the pQE15 vector having a sequence encoding the polypeptide of the invention, which is introduced along with the pREP4 vector into an appropriate auxotroph such as E. coli leucine auxotroph SG13009 strain, which is useful, for example, for producing a polypeptide where leucine residues are replaced with a non-natural amino acid.

Introduction of the rDNA molecules of the present invention into an appropriate cell host is accomplished by well known methods that typically depend on the type of vector used and host system employed. For transformation of prokaryotic host cells, electroporation and salt treatment methods are typically employed, see for example, Cohen et al., 1972 Proc Natl Acad Sci USA 69:2110; Maniatis, T., et al., 1989 Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. For transformation of vertebrate cells with vectors containing rDNAs, electroporation, cationic lipid, or salt treatment methods are typically employed, see, for example, Graham et al., 1973 Virol 52:456; Wigler et al., 1979 Proc Natl Acad Sci USA 76:1373-76.

Successfully transformed cells, i.e., cells that contain a rDNA molecule of the present invention, are identified by well-known techniques. For example, cells resulting from the introduction of a rDNA of the present invention are selected and cloned to produce single colonies. Cells from those colonies are harvested, lysed and their DNA content examined for the presence of the rDNA using a method such as that described by Southern, J Mol Biol (1975) 98:503, or Berent et al., Biotech (1985) 3:208, or the proteins produced from the cell are assayed via a biochemical assay or immunological method such as Western blotting.

Recombinant methods are preferred if longer proteins, higher yield, or a controlled degree of non-natural amino acid incorporation is desired. Recombinant methods involve expressing the cloned gene in a suitable host cell. For example, an expression vector having the nucleotide sequence encoding the protein of interest is introduced into a suitable host cell. The host cell is then cultured under conditions that permit in vivo production of the desired protein, wherein one or more naturally occurring amino acids in the desired protein or polypeptide are replaced with the non-natural amino acid analogs and derivatives. In many applications, for example, when replacing leucine with a fluorinated amino acid analog, it may be desirable to achieve only partial incorporation of the fluorinated amino acid in the hydrophobic core because fully fluorinated proteins are usually obtained in lower yields and may be compromised in activity. Therefore it is important to be able to control the levels of incorporation of non-natural amino acid so that the protein stability, activity and yields are all optimal.

ADVANTAGES OF THE INVENTION

This invention introduces a unique strategy that can be widely applied to stabilize the folded structure of proteins and polypeptides under normally denaturing conditions. Proteins and polypeptides modified using the method of this invention exhibit higher stability under denaturing conditions such as elevated temperature, presence of denaturing chemicals, extreme solution pH and other non-physiological environments.

The method of this invention changes the building blocks of protein synthesis, leaving the “blueprint” encoding the proteins unchanged. This invention, therefore, permits a rapid and predictable approach to design and produce proteins and polypeptides with significantly increased stability.

The method of this invention is generally applicable to a large range of proteins, enzymes, and peptides, and is not limited by size or structure. For example, the incorporation of hyper-hydrophobic amino acids such as the fluorinated amino acids results in very minimal perturbation of the protein structure, and the inert nature of the fluorinated side-chains will leave many protein functions unchanged. The catalytic, signaling, or inhibitory activities of proteins containing non-natural amino acids should not be compromised as a cost of increased stability.

Incorporation of hyper-hydrophobic amino acids should be especially useful for stabilizing small membrane peptides that rely on hydrophobic side chains to form ion channels in membranes and that may also benefit from increased membrane (or inter-peptide) association upon fluorination. Furthermore, the feasibility of incorporating fluorinated amino acids using the in vivo methods should allow the fluorination of enzymes, signaling molecules, and protein ligands, and may prove to be of broad utility in the engineering of more robust biological assemblies.

The following examples are presented to illustrate the present invention and to assist one of ordinary skill in making and using the same. The examples are not intended in any way to otherwise limit the scope of the invention.

Example 1

The following Example provides a description of how the non-natural amino acid analogs trifluoroleucine (Tfl) and hexafluoroleucine (Hfl) were prepared.

Trifluoroleucine (Tfl, FIG. 1B) was synthesized in an overall yield of 22% in seven steps starting from β-trifluoromethylcrotonic acid (Oakwood Chemical, Columbia, S.C.), according to the procedure of Rennert et al. with slight modifications (Rennert, O. M.; Anker, H. S. Biochemistry 1963, 2, 471). DL-trifluoroleucine prepared by this method as the N-acetylated racemic mixture was resolved to L-trifluoroleucine by treatment with porcine kidney acylase (Sigma) to >99% enantiomeric excess (e.e.) (Chenault et al. J. Am. Chem. Soc. 111, 6354-6364, 1989). The yield for the resolution was 67%. The determination of e.e. was accomplished by ¹H NMR spectroscopy following derivatization with Mosher's acid, R-(+)-methoxytrifluoromethylphenylacetic acid.

Hexafluoroleucine (Hfl, FIG. 1B) was prepared by modification of the procedures reported by Zhang et al. (Zhang, C.; Ludin, C.; Eberle, M. K.; Stoeckli-Evans, H.; Keese, R. Helv. Chim. Acta 1998, 81, 174-181).

Example 2

The following Example provides a description of the chemical synthesis of “wild type” (Leu-GCN4-pl) and trifluoroleucine-incorporated (Tfl-GCN4-pl) forms of the leucine zipper peptide GCN4-pl.

The amino acid sequence of GCN4-pl is shown in FIG. 1A. Both the “wild type” (Leu-GCN4-pl) and fluorinated (Tfl-GCN4-pl) forms of the leucine zipper peptide GCN4-pl were synthesized at the Biopolymer Synthesis Center at the California Institute of Technology (Pasadena, Calif. 91125). Automated, stepwise solid-phase synthesis was performed on an ABI 433A synthesizer employing Fmoc chemistry. To prepare the fluorinated peptide (Tfl-GCN4-pl), N-Fmoc-5,5,5-trifluoro-L-leucine prepared as described in Example 1 was used as an equimolar mixture of the 2S,4S- and the 2S,4R-isomers, and incorporated into the peptide with extended coupling cycles. After chain assembly was complete, the peptide was deprotected and removed from the resin support with trifluoroacetic acid in the presence of 1,2-ethanedithiol, thioanisole and water. Peptides were precipitated into cold methyl t-butyl ether and isolated by centrifugation. Peptide products were purified by preparative C₁₈ reverse phase HPLC using a non-linear gradient of 0-80% elution solution (0.1% TFA/60% acetonitrile/40% H₂O) in 120 min. Neither Leu-GCN4-pl nor Tfl-GCN4-pl is acylated at the N-terminus; hence the thermal melting temperature of Leu-GCN4-pl is lower than that reported for acylated GCN4-pl. After HPLC purification, the molar mass of Tfl-GCN4-pl was confirmed to be 4213 Da, 216 mass units higher than that of Leu-GCN4-pl.
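
The reported mass difference can be checked with a short calculation. The following Python fragment is an illustrative sketch, not part of the described procedure: it assumes that each leucine-to-trifluoroleucine substitution replaces three hydrogen atoms with three fluorine atoms, and that four residues (the four d-positions) are substituted. The function name is hypothetical, and the atomic masses are standard average values.

```python
MASS_H = 1.008   # average atomic mass of hydrogen, Da
MASS_F = 18.998  # average atomic mass of fluorine, Da


def tfl_mass_shift(substitutions):
    """Mass increase in Da when `substitutions` Leu residues become Tfl.

    Each substitution converts one CH3 group into CF3 (three H replaced by three F).
    """
    return substitutions * 3 * (MASS_F - MASS_H)


shift = tfl_mass_shift(4)            # four d-position leucines assumed
implied_leu_mass = 4213 - round(shift)
print(round(shift, 2), implied_leu_mass)
```

The computed shift rounds to 216 Da, which agrees with the difference stated above and implies a mass of about 3997 Da for Leu-GCN4-pl.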

Example 3

The following Example provides a description of the procedure used to express, in E. coli cells, the “wild type” proteins and the corresponding “artificial” proteins of the invention, in which the leucine residues are replaced with a non-natural fluorinated amino acid.

Analog Incorporation Assay.

The expression vector pQE-A1, which contains the coding sequence for the protein A1 (FIG. 4A), was obtained from the US Army Natick RD&E Center (Natick, Mass.). The E. coli leucine auxotroph SG13009 was obtained from Qiagen (Chatsworth, Calif.) and transformed with plasmids pREP4 and pQE-A1 to yield the expression host LAE-A1.

M9AA medium (30 ml) supplemented with 1 mM MgSO₄, 1 mM CaCl₂, 20 wt % glucose, 1 mg/L thiamin and the antibiotics ampicillin (200 mg/L) and kanamycin (25 mg/L) was inoculated with 1 ml of an overnight 2×YT culture of the expression strain. After the culture had grown to an OD₆₀₀ of 1.0 at 37° C., the cells were collected by centrifugation at 5,000 g for 10 min at 4° C. The supernatant was removed and the cell pellets were washed with 0.9% NaCl and sedimented (5,000 g, 10 min, 4° C.). The washing and sedimentation steps were repeated three times to remove residual leucine. The washed cells were then resuspended in 31 ml of supplemented M9AA medium without leucine. Aliquots (5 ml) were added to test tubes containing no leucine (negative control), L-leucine (20 mg/L, positive control), DL-trifluoroleucine (40 mg/L) or DL-hexafluoroleucine (40 mg/L). The cultures were grown for 10 min at 37° C., and isopropyl-β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 1 mM. The cultures were grown for 3 hours and the cells were collected by sedimentation (13,000 g, 1 min, 4° C.). Cell pellets were resuspended in Buffer A (8 M urea, 0.1 M NaH₂PO₄, 0.01 M Tris, pH 8.0) and were frozen immediately. The whole cell lysate was analyzed by 15% SDS-PAGE. The proteins were detected by western blotting with an antibody specific for the N-terminal His-Tag of A1 (FIG. 4B).

Protein Expression and Purification.

M9 medium (1 L) supplemented as described above was inoculated with 30 ml of fresh overnight culture of the expression strain. After the culture had grown to an OD₆₀₀ of 1.0, it was subjected to the centrifugation and washing procedures described above. The cell pellet was suspended in 1 L of M9 medium supplemented with trifluoroleucine (100 mg/L) and leucine in appropriate concentrations. IPTG was added after 10 min to induce protein expression. The expression of protein was monitored by removing an aliquot (1 ml) of the culture every hour and analyzing it by SDS-PAGE. Cells were collected after 3 hr by centrifugation (5,000 g, 15 min, 4° C.). The pellets were resuspended in 20 ml of buffer A and stored at −80° C. overnight. The cells were thawed rapidly at 37° C., cell debris was sedimented (22,500 g, 50 min, 4° C.), and the supernatant was applied to a Ni-NTA column (1 cm×5 cm) (The Qiagen Expressionist, Purification Procedure, 1992, pp 45). The column was washed with 25 ml each of buffer A at pH 8.0, 6.5 and 5.9, sequentially. The target protein was eluted at pH 4.5. Fractions containing protein were combined and dialyzed (Spectra/Por membrane, MWCO at 3.5 kDa) against sterile water for 3 days. The dialysate was lyophilized to yield pure A1. The purity of the protein was examined by SDS-PAGE (15%) (FIG. 4B). The extent of trifluoroleucine incorporation was determined by amino acid analysis (DNA/Protein Analysis Facility, Cornell University) and MALDI mass spectrometry (Protein/Peptide Micro Analytical Laboratory, Caltech).

Example 4

The following Example describes procedures used for the biochemical characterization of wild-type proteins and the corresponding artificial proteins of the invention in which the leucine residues were replaced with non-natural amino acids.

Ultracentrifugation.

Sedimentation equilibrium analysis of the wild type proteins GCN4-p1 and A1 and the corresponding artificial proteins containing the non-natural amino acids was performed using a Beckman XLI analytical ultracentrifuge, recording interference data and radial absorbance at 236 and 280 nm simultaneously. Initial peptide concentrations ranged between 100 and 300 μM; the buffer was 0.01 M sodium phosphate, pH 7.4, containing 0.1 M NaCl. The samples were centrifuged at 35,000, 40,000 and 45,000 rpm until equilibrium was reached. Partial specific volumes were calculated by the residue-weighted average method of Cohn and Edsall (Harding, S. E.; Rowe, A. J.; Horton, J. C. Analytical Ultracentrifugation in Biochemistry and Polymer Science; The Royal Society of Chemistry: Cambridge, 1992). Solution densities were estimated using solute concentration-dependent density tables in the CRC Handbook of Chemistry and Physics. The data were fit as single species to provide an estimate of the aggregation states. Curve fitting of analytical ultracentrifuge data was done using Igor Pro (Wavemetrics Inc., Oswego WS) with procedures adapted from Brooks et al. (Brooks, I. S.; Soneson, K. K.; Hensley, P. Biophys. J. 1993, 64, 244) by Dr. James D. Lear.
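For orientation, a single-species fit of sedimentation equilibrium data rests on the textbook ideal-solute distribution c(r) = c₀·exp[M(1 − v̄ρ)ω²(r² − r₀²)/(2RT)]. The sketch below illustrates that relationship only; it is not the Igor Pro procedure referred to above, and the partial specific volume, solution density, radii and molar mass used in the example are hypothetical values chosen for illustration.

```python
import math

R_ERG = 8.314e7  # gas constant in erg/(mol K), matching CGS units (cm, g, s)


def rpm_to_rad_per_s(rpm):
    """Convert rotor speed from revolutions per minute to radians per second."""
    return rpm * 2.0 * math.pi / 60.0


def equilibrium_concentration(r_cm, r0_cm, c0, molar_mass, vbar, rho, rpm, temp_k):
    """Concentration of one ideal species at radius r, relative to c0 at r0."""
    omega = rpm_to_rad_per_s(rpm)
    sigma = molar_mass * (1.0 - vbar * rho) * omega ** 2 / (2.0 * R_ERG * temp_k)
    return c0 * math.exp(sigma * (r_cm ** 2 - r0_cm ** 2))


def apparent_molar_mass(slope_lnc_vs_r2, vbar, rho, rpm, temp_k):
    """Molar mass recovered from the slope of ln(c) plotted against r^2."""
    omega = rpm_to_rad_per_s(rpm)
    return 2.0 * R_ERG * temp_k * slope_lnc_vs_r2 / ((1.0 - vbar * rho) * omega ** 2)


# Hypothetical inputs: a dimer of a 4213 Da peptide, assumed vbar and density.
M_TRUE, VBAR, RHO, RPM, TEMP = 2 * 4213.0, 0.73, 1.005, 40000, 298.15
r0, r1 = 6.9, 7.1  # radial positions in cm (illustrative)

c_r0 = equilibrium_concentration(r0, r0, 1.0, M_TRUE, VBAR, RHO, RPM, TEMP)
c_r1 = equilibrium_concentration(r1, r0, 1.0, M_TRUE, VBAR, RHO, RPM, TEMP)

# Slope of ln(c) versus r^2 between the two radii gives back the molar mass.
slope = (math.log(c_r1) - math.log(c_r0)) / (r1 ** 2 - r0 ** 2)
m_fit = apparent_molar_mass(slope, VBAR, RHO, RPM, TEMP)
print(f"recovered molar mass: {m_fit:.0f} g/mol")
```

Comparing the recovered molar mass with the monomer mass is what gives the estimate of the aggregation state.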

Circular Dichroism (CD) Analysis.

CD spectra were recorded on an Aviv 62DS spectropolarimeter (Lakewood, N.J.) in PBS buffer, pH 7.4. All peptide concentrations were determined by amino acid analysis of a stock solution (5 mg/ml). Experiments were performed in a rectangular cell with a pathlength of 1 cm for low concentration samples and a pathlength of 1 mm for high concentration samples. Spectra were scanned from 250 nm to 200 nm with data taken every 1 nm. The temperature of the solution was maintained by a thermostatically controlled cuvette holder (HP model 89101A). Temperature scans were performed from 0° C. to 100° C. in 1° C. steps. Three scans were performed on a single sample and averaged. Each data point was collected after 30 seconds of thermal equilibration at the desired temperature. Urea titration scans were performed manually, starting with the 8 M concentrated sample followed by serial dilution to the desired urea concentration. At each concentration, the sample was allowed to equilibrate for 5 min before the CD signal was recorded.

The analysis of CD thermal melting data was performed according to a previously described procedure using a two-state model (Schneider, J. P.; Lear, J. D.; DeGrado, W. F. J. Am. Chem. Soc. 1997, 119, 5742-5743). Temperature-dependent ellipticity data for each protein at different concentrations were fitted globally using a non-linear least-squares fitting procedure supplied with the Origin 6.0 software. For wild-type A1, the concentrations used for curve fitting were 10 and 100 μM, while 2 and 100 μM were used to fit data for the 92% fluorinated A1. The thermodynamic quantities T_(m), ΔH_(m), ΔC_(p) and K_(d) were parameters of the fitting procedure and are reported in the 1 M standard state. The free energy of folding at any temperature is given by equation 1.

ΔG° = ΔH_(m)(1 − T/T_(m)) − ΔC_(p)[(T_(m) − T) + T ln(T/T_(m))]  (1)
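As an illustration of how equation 1 is evaluated, the sketch below computes ΔG° at 37° C. from the fitted T_(m), ΔH_(m) and ΔC_(p) values listed for A1-WT and FA1-92 in Table 3 (1 M standard state). The function and variable names are illustrative only, and the results should be read as a consistency check, not as a reproduction of the global fit.

```python
import math


def folding_free_energy(temp_k, tm_k, dh_m, dcp):
    """Equation 1: dG = dH_m(1 - T/T_m) - dC_p[(T_m - T) + T ln(T/T_m)].

    dh_m in kcal/mol, dcp in kcal/(mol K), temperatures in kelvin.
    """
    return dh_m * (1.0 - temp_k / tm_k) - dcp * (
        (tm_k - temp_k) + temp_k * math.log(temp_k / tm_k)
    )


T37 = 37.0 + 273.15

# Table 3 values: T_m (1 M) in deg C, dH_m in kcal/mol, dC_p in cal/(mol K).
a1_wt = folding_free_energy(T37, 103.0 + 273.15, -70.9, -252.0 / 1000.0)
fa1_92 = folding_free_energy(T37, 116.0 + 273.15, -77.0, -272.0 / 1000.0)

print(f"A1-WT  dG(37 C) = {a1_wt:.1f} kcal/mol")
print(f"FA1-92 dG(37 C) = {fa1_92:.1f} kcal/mol")
print(f"ddG             = {fa1_92 - a1_wt:.1f} kcal/mol")
```

The computed values fall close to the tabulated −10.7 and −13.1 kcal/mol, well inside the stated ±1.2 kcal/mol uncertainty, and the difference reproduces the ΔΔG° of about −2.4 kcal/mol. By construction, the expression is zero at T = T_(m).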

Example 5

The following Example provides an interpretation of the results of the biochemical analyses of the wild type GCN4-p1 peptide and the corresponding artificial Tfl-GCN4-p1 peptide in which the four leucines were replaced with non-natural amino acids.

Secondary structures of the wild type polypeptide Leu-GCN4-p1 and the corresponding artificial polypeptide Tfl-GCN4-p1 were analyzed by circular dichroism (CD). The CD spectra of both peptides indicated high helical content, as evidenced by double minima at 222 nm and 208 nm (FIG. 2A). The spectra of the wild type and fluorinated peptides are essentially coincident, suggesting nearly identical secondary structures; both peptides are highly helical at 0° C. This confirms the proposal that replacement of leucine with Tfl would not disrupt interhelical packing or interfere with folding of the coiled-coil structure. Ultracentrifugation indicated that Tfl-GCN4-p1 is predominantly dimeric at the concentrations of interest in this work. Data for Tfl-GCN4-p1 were fit to a monomer-dimer-trimer equilibrium, giving K_(d) values on the order of 10⁻⁸ M and 10⁻¹⁴ M², respectively, for the monomer-to-dimer and monomer-to-trimer equilibria. In the concentration range of approximately 10 μM-40 μM the protein is approximately 85% dimeric.

The thermal stabilities of the coiled-coil dimers of Leu-GCN4-p1 and Tfl-GCN4-p1 were examined by CD spectroscopy (FIG. 2B). A significant increase in the thermal stability of Tfl-GCN4-p1 as compared to Leu-GCN4-p1 is reflected in an elevation of the thermal denaturation temperature from 48° C. to 61° C. The 13° C. increase in T_(m) is remarkable in view of the fact that no increase in the thermal stability of GCN4-p1 has been reported based solely on substitution of the leucine residues at the d-positions. Mutations at the d-positions to other natural amino acids have all resulted in losses in helix stability due to decreases in packing efficiency, since such changes are usually of the “large to small” type (J. Moitra, L. Szilak, D. Krylov, C. Vinson, Biochemistry 36, 12567, 1997). The similarity in the melting curves of the two peptides suggests similarly cooperative unfolding of the dimeric structure to unfolded monomers. As expected for a monomer-dimer equilibrium, the denaturation curves depend on the peptide concentrations, and their midpoints shift to higher temperature as the concentrations of peptides are increased (FIG. 2C). The thermodynamic changes associated with the transition (folded dimer to unfolded monomers) are calculated from the melting curves by fitting the data to a monomer-dimer equilibrium (J. P. Schneider, J. D. Lear and W. F. DeGrado, J. Am. Chem. Soc. 119, 5742, 1997). Global analysis of the thermal unfolding curves at two different concentrations, approximately 85 μM and 3 μM, gave a calculated ΔH° of 60.2 kcal mol⁻¹, T_(m) of 385.4 K and ΔC_(p) of 530 cal mol⁻¹ K⁻¹ for Tfl-GCN4-p1 (1 M standard state); the corresponding values for Leu-GCN4-p1 are 70.3 kcal mol⁻¹, 365.6 K and 740 cal mol⁻¹ K⁻¹, respectively. Under all conditions where a direct experimental comparison was possible, Tfl-GCN4-p1 was 0.5 to 1.2 kcal mol⁻¹ more stable than Leu-GCN4-p1; for example, at 50° C., the K_(d) of dimerization was 67.8 μM for Leu-GCN4-p1 and 9.8 μM for Tfl-GCN4-p1.
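The K_(d) values quoted at 50° C. can be converted into a stability difference with ΔG° = −RT ln K_(d), and into a fraction of unfolded monomer using the two-state monomer-dimer mass balance. The sketch below is a worked illustration under those textbook relations; the gas constant is a standard value, the function names are hypothetical, and the 85 μM total peptide concentration is simply one of the concentrations mentioned above.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def delta_g_dissociation(kd_molar, temp_k):
    """Standard free energy of dimer dissociation (1 M standard state)."""
    return -R_KCAL * temp_k * math.log(kd_molar)


def fraction_unfolded(kd_molar, total_peptide_molar):
    """Monomer fraction for 2M <-> D, with Kd = [M]^2/[D] and Pt = [M] + 2[D].

    Solves 2*Pt*f^2 + Kd*f - Kd = 0 for the physically meaningful root.
    """
    kd, pt = kd_molar, total_peptide_molar
    return (-kd + math.sqrt(kd * kd + 8.0 * kd * pt)) / (4.0 * pt)


T50 = 50.0 + 273.15
KD_LEU = 67.8e-6  # M, Leu-GCN4-p1 at 50 deg C (from the text)
KD_TFL = 9.8e-6   # M, Tfl-GCN4-p1 at 50 deg C (from the text)

# Positive ddg means the fluorinated dimer is harder to dissociate.
ddg = delta_g_dissociation(KD_TFL, T50) - delta_g_dissociation(KD_LEU, T50)
print(f"ddG(50 C) = {ddg:.2f} kcal/mol")

for name, kd in (("Leu", KD_LEU), ("Tfl", KD_TFL)):
    print(name, f"fraction unfolded at 85 uM: {fraction_unfolded(kd, 85e-6):.2f}")
```

The difference comes out at roughly 1.2 kcal mol⁻¹, the upper end of the range quoted above. The same mass balance also shows why the midpoint shifts with concentration: the unfolded fraction equals one half when the total peptide concentration equals K_(d).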

The stability of Tfl-GCN4-p1 toward denaturation by chaotropic reagents was demonstrated through guanidine hydrochloride (GuHCl) titration experiments (FIG. 2D). At each temperature examined, the fluorinated peptide displayed significantly lower susceptibility toward denaturation by GuHCl; in each case, the concentration of GuHCl needed to unfold 50% of the peptide was higher for Tfl-GCN4-p1 than for the wild-type peptide.

To determine the origins of the stabilizing effect of side-chain fluorination, molecular dynamics (MD) calculations were performed using the Poisson-Boltzmann (PB) continuum description of the solvent (K. T. Lim, S. Brunett, M. Iotov, B. McClurg, N. Vaidehi, S. Dasgupta, S. Taylor, and W. A. Goddard III, J. Comp. Chem. 18, 501, 1997), which includes the Cell Multipole Method (H. Q. Ding, N. Karasawa, W. A. Goddard III, J. Chem. Phys. 97, 4309, 1992; D. J. Tannor, B. Marten, R. Murphy, R. A. Friesner, D. Sitkoff, A. Nicholls, M. Ringnalda, W. A. Goddard III, J. Am. Chem. Soc., 116, 11875, 1994; A. Ghosh, C. S. Rapp, R. A. Friesner, J. Phys. Chem. B, 102, 10983, 1998). The PB description of solvation implicitly includes entropic changes in the solvent; thus the calculations lead directly to the binding free energies (ΔG^(BE)). The MPSIM MD program and the DREIDING Force Field (FF) were used for all calculations.

The starting structure for the Leu-GCN4-p1 dimer was taken from the RCSB Protein Data Bank; those of the fluorinated dimers were derived from the native dimer structure by replacement of the appropriate methyl hydrogens with fluorines, followed by re-optimization of the structure. Because the γ-carbon of Tfl is asymmetric (FIG. 1B), multiple arrangements of adjacent diastereotopic trifluoromethyl groups had to be considered.

When both Tfl residues at a given d-position are of the (2S,4S) configuration, the two trifluoromethyl groups are relatively close to one another; the fluorinated carbon centers are separated by ca. 6 Å. On the other hand, when two (2S,4R) isomers are juxtaposed, the corresponding carbon-carbon distance increases to about 8 Å. In the remaining configurations (where the two strands carry different isomers), the trifluoromethyl groups are separated by intermediate distances.

To determine how side-chain stereochemistry affects dimer stability, simulations were performed on all configurations. For simulation of strands containing different stereoisomers of Tfl, only those cases in which all four Tfl residues on one strand have the same stereoconfiguration were considered.

From the 1 ns trajectory, the average properties were calculated over 800 ps after equilibration. ΔG^(BE) was calculated as the difference in energy between the solvated monomers and the solvated dimer, each obtained from a separate SGB MD calculation (final solvation energies with PBF). Table 1 reports the average values of ΔG^(BE) for the native and fluorinated forms, quoted per mole of the monomer. The % increase is the increase in ΔG^(BE) relative to the Leu-GCN4-p1 structure. The % helicity of each structure is also shown.

TABLE 1
Binding free energies (ΔG^(BE), kcal/mol) of Leu-GCN4-p1 and fluorinated dimers.

Structure            ΔG^(BE)   % increase   % helicity^(b)
Leu-GCN4-p1           65.08        0           90.8
^(a)Close (4S,4S)     93.75       44           84.3
Far (4R,4R)           98.14       51           79.4
Mixed (4S,4R)         99.20       52           81.1
Mixed (4R,4S)        111.15       71           89.3
Tfl-average          100.56       55           83.5
Hfl-GCN4-p1           77.21       19           78.5

^(a)Close, Far, Mixed: Configuration of the pair of trifluoromethyl groups as illustrated in FIG. 4. Tfl-average: The averaged ΔG^(BE) of the four configurations.
^(b)Helicity quoted here was calculated as the ratio of the residues with torsion angles φ and ψ in the helical region of the Ramachandran plot to the total number of residues in the protein.
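The derived columns of Table 1 follow directly from the ΔG^(BE) values. As a short sketch, the percentage increases and the Tfl-average can be recomputed as shown below; the dictionary keys are labels chosen for this illustration, and the numbers are copied from the table.

```python
# Binding free energies from Table 1, in kcal/mol per mole of monomer.
dg_be = {
    "Leu-GCN4-p1": 65.08,
    "Close (4S,4S)": 93.75,
    "Far (4R,4R)": 98.14,
    "Mixed (4S,4R)": 99.20,
    "Mixed (4R,4S)": 111.15,
    "Hfl-GCN4-p1": 77.21,
}


def percent_increase(value, reference):
    """Increase of value over reference, expressed as a percentage."""
    return 100.0 * (value - reference) / reference


reference = dg_be["Leu-GCN4-p1"]
tfl_configs = ["Close (4S,4S)", "Far (4R,4R)", "Mixed (4S,4R)", "Mixed (4R,4S)"]

# Unweighted mean over the four stereochemical arrangements.
tfl_average = sum(dg_be[k] for k in tfl_configs) / len(tfl_configs)

for name in tfl_configs + ["Hfl-GCN4-p1"]:
    print(f"{name:15s} {percent_increase(dg_be[name], reference):5.1f} %")
print(f"Tfl-average     {tfl_average:.2f} kcal/mol, "
      f"{percent_increase(tfl_average, reference):.1f} %")
```

The Tfl-average is an unweighted mean, which treats the four arrangements as equally populated; that weighting is implied by the table but is not otherwise justified here.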

The Tfl-GCN4-p1 dimers are predicted to exhibit ΔG^(BE) ca. 55% larger than that of the leucine form (calculated relative to the respective random coil monomers). The various stereochemical arrangements lead to increases in binding energies ranging from 44% to 71%, indicating that side-chain configuration may have some differential effect on dimer stability. Similar calculations for the hexafluoroleucine (Hfl) dimer lead to the prediction that such dimers will be significantly less stable than the Tfl dimers but marginally more stable (19%) than the wild type.

To investigate the source of stability of the fluorinated dimers, the components of the binding energy for each peptide were analyzed (Table 2). The primary driving forces for stabilizing the Tfl-GCN4-p1 dimers arise from van der Waals (vdW) and hydrogen bonding interactions. The solution structures of the Hfl and wild type monomers are globular, while the Tfl-GCN4-p1 monomer is more extended. This is due to the Tfl side chains, which produce local “kinks” through favorable electrostatic interactions. These “kinks” stabilize the more extended form of the Tfl-GCN4-p1 monomer. For example, a hairpin is formed by favorable interactions between Tfl₅ and Tfl₁₂ in the monomer of Tfl-GCN4-p1. The wild type and Hfl monomers, on the other hand, do not form local “kinks” but instead fold into globular structures. These structures have more non-local hydrogen bonds and more favorable vdW contacts than the more extended Tfl-GCN4-p1 monomer. Hence, the gain in H-bond and vdW energies in forming a dimer is greater for Tfl-GCN4-p1 than for the wild-type or Hfl peptides, because the latter peptides must pay an energy cost to unfold before dimerization.

TABLE 2
Components of ΔG^(BE) (kcal/mol) for Leu-GCN4-p1 and fluorinated dimers (quoted for one mole of the monomer)

Structure             ΔG^(valence)   ΔG^(coulomb+solvation)   ΔG^(vdW)   ΔG^(Hbond)
Leu-GCN4-p1             −16.12            −16.66               41.82       56.05
^(a)Close (4S,4S)        −8.46            −16.64               59.56       59.29
Far (4R,4R)              −9.54             +1.73               42.16       63.80
Mixed1 (4S,4R)           −5.36            −10.55               65.17       49.94
Mixed2 (4R,4S)          −27.96             −1.79               48.29       92.61
Tfl-average             −12.83             −6.82               53.80       66.41
Hfl-GCN4-p1             −23.06              7.24               36.51       56.19

^(a)Close, Far, Mixed: Configuration of the pair of trifluoromethyl groups as illustrated in FIG. 4. Tfl-average: The averaged ΔG of the four configurations.
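Because Table 2 decomposes the binding free energies of Table 1, the four components of each row should add up to the corresponding ΔG^(BE). The sketch below performs that bookkeeping check with values copied from the two tables; the data layout and names are illustrative only.

```python
# (valence, coulomb+solvation, vdW, H-bond) from Table 2, in kcal/mol.
components = {
    "Leu-GCN4-p1": (-16.12, -16.66, 41.82, 56.05),
    "Close (4S,4S)": (-8.46, -16.64, 59.56, 59.29),
    "Far (4R,4R)": (-9.54, 1.73, 42.16, 63.80),
    "Mixed1 (4S,4R)": (-5.36, -10.55, 65.17, 49.94),
    "Mixed2 (4R,4S)": (-27.96, -1.79, 48.29, 92.61),
    "Tfl-average": (-12.83, -6.82, 53.80, 66.41),
    "Hfl-GCN4-p1": (-23.06, 7.24, 36.51, 56.19),
}

# Total binding free energies from Table 1, in the same row order.
totals = {
    "Leu-GCN4-p1": 65.08,
    "Close (4S,4S)": 93.75,
    "Far (4R,4R)": 98.14,
    "Mixed1 (4S,4R)": 99.20,
    "Mixed2 (4R,4S)": 111.15,
    "Tfl-average": 100.56,
    "Hfl-GCN4-p1": 77.21,
}

# Difference between the summed components and the tabulated total.
residuals = {name: sum(parts) - totals[name] for name, parts in components.items()}

for name, diff in residuals.items():
    print(f"{name:15s} sum - total = {diff:+.2f} kcal/mol")
```

For the leucine and Tfl rows the components reproduce the totals to within rounding. The Hfl row sums to a value about 0.3 kcal/mol below the Table 1 entry, a small gap that the tables themselves do not explain.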

Consideration of electrostatic (intra- and inter-peptide coulomb forces) and solvation interactions suggests that the hydrophobic preference in the dimer for burial of CF₃ is greater than for CH₃. Considering just coulomb and solvation interactions, the driving force for dimerization is predicted to decrease in the order Hfl>Tfl>Leu. It is the balance of desolvation, electrostatics, H-bonding and vdW forces that leads to the prediction that the Tfl dimers are more stable than the Hfl dimer, which in turn is more stable than the native leucine dimer. The average helicity of the dimers is predicted to be 90.8% for Leu-GCN4-p1, 83.5% for Tfl, and 78.5% for Hfl.

These results demonstrate that the subtle change from four leucine methyl groups to four trifluoromethyl groups results in a large gain in stability of the folded structure. It is remarkable that for a small peptide of the size of GCN4-p1, fluorination results in a modified coiled-coil structure that is highly resistant to both thermal and denaturant unfolding as compared to the wild-type peptide.

Example 6

The following example describes biochemical analysis of the wild type bZip peptide and the corresponding artificial bZip peptide in which the leucines were replaced with non-natural amino acids.

To investigate the effects of fluorination on the biological activities of coiled-coil proteins, a fluorinated DNA binding protein, Tfl-bZip, was constructed. The wild-type protein Leu-bZip is a 56 amino acid segment (residues 226-281) of the eukaryotic transcription factor GCN4. The N terminus of Leu-bZip contains the DNA recognition domain, which is rich in basic residues such as lysine and arginine. The C-terminal subdomain of Leu-bZip contains the GCN4-p1 peptide segment and facilitates the dimerization of the protein. While the direct contact between the N terminus residues and DNA is important to the recognition between protein and DNA, the specific protein-protein interactions at the C terminus are also responsible for the specificity and affinity between protein and DNA (K. Arndt and G. R. Fink, Proc. Natl. Acad. Sci. USA, 83, 8516, 1986; Y. Aizawa, Y. Sugiura, M. Ueno, Y. Mori, K. Imoto, K. Makino and T. Morii, Biochemistry 38, 4008, 1999; S. C. Hockings, J. D. Kahn and D. M. Crothers, Proc. Natl. Acad. Sci. USA 95, 1410, 1998). Tfl-bZip contains Tfl-GCN4-p1 at its dimerization domain while the DNA binding domain is unchanged. The secondary structures of the two proteins were identical, as confirmed by CD, and the thermal melting temperature of Tfl-bZip was elevated by 8° C. compared to Leu-bZip at 10 μM protein concentration.

The DNA binding domain of Leu-bZip is in a random coil conformation in the absence of specific DNA sequences, while the dimerization domain forms a dimeric coiled coil through the leucine zipper motif at concentrations above the monomer-to-dimer equilibrium (M. A. Weiss, T. Ellenberger, C. R. Wobbe, J. P. Lee, S. C. Harrison and K. Struhl, Nature 346, 575, 1990). Upon recognition of DNA, the DNA binding region folds into an α-helical structure and the protein binds to the DNA in a “chopstick” model (T. Ellenberger, C. J. Brandl, K. Struhl and S. C. Harrison, Cell 71, 1223, 1992; P. Konig and T. J. Richmond, J. Mol. Biol. 233, 139, 1993; W. Keller, P. Konig and T. J. Richmond, J. Mol. Biol. 254, 657, 1995). CD analysis of the Tfl-bZip secondary structure revealed that the fluorinated protein behaves the same way as Leu-bZip (FIG. 3A). Before the addition of oligonucleotides containing the CREB binding site, Tfl-bZip is approximately 70% helical, corresponding to its helical dimerization domain. After the addition of oligonucleotides, Tfl-bZip became almost 100% α-helical, indicating a transition of the binding region from coil to helix after DNA recognition. The similar change in protein structure observed for Leu-bZip and Tfl-bZip confirms that fluorination of the zipper domain does not affect the recognition and association between protein and DNA.

The affinity and specificity of the fluorinated protein binding to recognition sequences were shown by gel-retardation assays to be identical to those of the wild type protein (FIG. 3B) (S. J. Metallo and A. Schepartz, Chem. Biol. 1, 143, 1994; D. N. Paolella, C. R. Palmer and A. Schepartz, Science 264, 1130, 1994). Leu-bZip binds to both the AP-1 and CREB binding sites with equal affinities even though the spacing between the half-sites of DNA is different. Densitometry analysis of the mobility shift assay revealed that Tfl-bZip binds to both sequences with equal specificity (K_(d)=12.5±0.7 nM for AP-1 and 5.4±0.6 nM for CREB) and with affinity identical to that observed for Leu-bZip (K_(d)=12.8±1 nM for AP-1 and 4.8±0.5 nM for CREB). Neither protein recognizes nonspecific (NON) sequences, as shown by the lack of protein-bound DNA in the assay containing NON oligonucleotides.

Example 7

The following example describes biochemical analysis of the wild type A1 peptide and the corresponding artificial peptide of the invention in which the leucines were replaced with non-natural amino acids.

The A1 protein (FIG. 4A) forms dimeric coiled coils in aqueous solution. It has been previously used as an element of artificial multidomain proteins that form reversible hydrogels under conditions of controlled pH and temperature (Petka, W. A.; Harden, J. L.; McGrath, K. P.; Wirtz, D.; Tirrell, D. A. Science 1998, 281, 389-392). The A1 protein contains eight leucine residues, of which six are distributed at the d positions of the six heptad repeats. By using a leucine auxotrophic strain of E. coli, trifluoroleucine-substituted A1 was prepared at levels of fluorination that ranged from 17% to 92%. The thermal and chemical stabilities of the fluorinated proteins were significantly elevated compared to those of the wild type A1 protein. Hfl does not support measurable protein synthesis in E. coli under the conditions examined in this example.

The ability of Tfl to support protein synthesis is shown in FIG. 4B. The increase in cell density three hours after the cells were suspended in medium enriched with Tfl shows that the essential cellular proteins produced after the medium shift step, which incorporate Tfl, are functional and continue to support cell growth. This is consistent with the observation that Tfl can support the exponential phase of cell growth (Rennert, O. M.; Anker, H. S. Biochemistry 1963, 2, 471).

The ability to control the level of substitution of non-natural amino acids for their natural amino acid counterparts is demonstrated in FIG. 5. The degree of substitution is determined from the diminution of leucine mole fractions in amino acid analysis (AAA). MALDI mass spectrometry analysis was also performed on the protein. For FA1-92, the predominant peak observed was at 8736 mass units, corresponding to 8 substitutions of trifluoromethyl groups for methyl groups (each substitution results in an increase of 54 mass units). However, the peak was broad, indicating the presence of proteins that were not completely substituted, as confirmed by the 92% replacement found by amino acid analysis. 100% replacement was not obtained even in the absence of leucine in the expression medium, possibly because of trace amounts of leucine resulting from cellular protein degradation. The levels of incorporation could be related to the amounts of leucine in the expression medium in a predictable fashion. It can be estimated that 50% substitution can be achieved at a leucine (pure L form) concentration of 8 mg/L (50 μM) and a Tfl (DL mixture) concentration of 100 mg/L (540 μM), which suggests that the specificity of LeuRS for Tfl is reduced only about fivefold relative to leucine.
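The roughly fivefold estimate can be reconstructed with a simple competition argument: at 50% substitution the two amino acids are incorporated at equal rates, so the ratio of their effective concentrations gives the relative selectivity. The sketch below assumes, as the "pure L form" and "DL mixture" labels suggest, that only the L enantiomer of Tfl is a substrate; that assumption and the function name are illustrative and are not stated outright in the text.

```python
def relative_selectivity(conc_natural_um, conc_analog_um, fraction_analog):
    """Selectivity for the natural amino acid relative to the analog.

    Assumes simple competition, in which the incorporation ratio
    analog/natural equals (selectivity ratio) * (concentration ratio).
    """
    incorporation_ratio = fraction_analog / (1.0 - fraction_analog)
    return conc_analog_um / (conc_natural_um * incorporation_ratio)


LEU_UM = 50.0      # L-leucine concentration from the text, in micromolar
TFL_DL_UM = 540.0  # DL-trifluoroleucine concentration from the text

# Assumption: only the L enantiomer (half of the DL mixture) competes.
tfl_l_um = TFL_DL_UM / 2.0

selectivity = relative_selectivity(LEU_UM, tfl_l_um, 0.5)
print(f"leucine is favored about {selectivity:.1f}-fold over L-Tfl")
```

If both enantiomers were counted instead, the same arithmetic would give a factor of about eleven, so the fivefold figure depends on the enantiomer assumption.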

The overall secondary structures of A1-WT and FA1-92 are essentially identical, with the same maximum helicity, as shown in FIG. 6. This is essentially because Tfl is able to maintain the tightly packed protein core.

Thermal and chemical unfolding studies revealed that the fluorinated peptides FA1-92, FA1-17, and FA1-29 are highly resistant to both thermal and chemical denaturation. The T_(m) of FA1-92 was elevated by 13° C. compared to A1-WT (FIG. 7, Table 3), while C_(m) for FA1-92 was increased to 7 M from the measured 2.7 M for A1-WT (FIG. 8, Table 3). These elevations are significant for a protein complex of 8 kDa size. More surprisingly, proteins with low levels of fluorination (FA1-17 and FA1-29) produced the most pronounced increase in stability. This is striking because at a 17% substitution rate, of the six leucine residues folded at the helical interface, only one is replaced by Tfl on average. However, the single substitution of methyl with trifluoromethyl resulted in increases of T_(m) by 6° C. and C_(m) by approximately 2 M. This result is contrary to the initial proposal that fully fluorinated proteins are the most stable because fluorocarbons self-aggregate more strongly in water than hydrocarbons. These results suggest that introduction of only a few trifluoromethyl groups into a core of hydrocarbons is sufficient to raise the protein folding driving force significantly. This result has important ramifications for using Tfl in the engineering of more stable protein structures.

TABLE 3

Protein   T_(m)^(a)   T_(m)^(b)   ΔG°^(c)     ΔΔG°   ΔH_(m)^(d)   ΔC_(p)^(e)   C_(urea, 50%)^(f)
          (10 μM)     (1 M)       (37° C.)           (1 M)        (1 M)        (0° C.)
A1-WT       54         103        −10.7               −70.9        −252          2.8
FA1-92      67         116        −13.1       −2.4    −77.0        −272          7

^(a)The midpoint of the thermal denaturation curve at 10 μM protein concentration (PBS, pH 7.4). Units for T_(m) are ° C.
^(b)Midpoint of the thermal denaturation curve extrapolated to 1 M standard state using non-linear least squares fit.
^(c)The free energy of folding at 37° C. at 1 M standard state. Units for ΔG° and ΔΔG° are kcal/mol.
^(d)The enthalpy of folding at the midpoint temperature extrapolated to 1 M standard state; units are kcal/mol.
^(e)Heat capacity change upon folding at 1 M standard state; units are cal/mol-K. The uncertainties for T_(m) (1 M), ΔG°, ΔH_(m) and ΔC_(p) are ±1.5° C., 1.2 kcal/mol, 4.8 kcal/mol, and 120 cal/mol-K, respectively.
^(f)The midpoint urea denaturation concentration at 0° C., in M of urea.

Global thermodynamic fitting was used to obtain the thermodynamic quantities associated with the monomer to dimer transition for A1-WT and FA1-92. The intermediate level fluorinated proteins, FA1-17 and FA1-29, were not fitted using the procedure because the heterogeneity of the samples would not be described accurately by the two-state model. The presence of a heterogeneous population can be seen from the broadening of the thermal melting curves at substitution rates of 17 and 29%. However, even for the nearly completely substituted FA1-92, the protein sample was still a mixed population because Tfl was used as an equimolar mixture of (2S,4S) and (2S,4R) diastereomers. The free energy of folding for FA1-92 was 2.4 kcal/mol more favorable than that for A1-WT at 37° C., which corresponds to 0.4 kcal/mol of stabilizing energy per Tfl residue involved at the helix interface. Considering the large number of leucines packed in the hydrophobic core of proteins, the additive stabilizing effects of Tfl can be quite substantial.

These results demonstrate that it is possible to efficiently incorporate Tfl into proteins produced in vivo, control the Tfl incorporation ratio, maintain protein secondary and higher order structures, and elevate the resistance of the proteins to thermal and chemical denaturation. Leucine is the most abundant of the amino acids in cellular proteins (9%) (T. Creighton, Proteins: Structure and Molecular Properties (W.H. Freeman and Company, New York, 1997)) and is especially important in determining the structure and stability of helix-bundle and other hydrophobic structural motifs. These results therefore indicate that the methods of the invention should apply to any protein with a hydrophobic core. In addition, since fluorination results in minimal modifications to protein structure and core packing, it is complementary to other existing technologies used in elevating protein stability (Lee, B.; Vasmatzis, G. Curr. Opin. Biotech. 1997, 8, 423-426; Handel, T. M.; Williams, S. A.; DeGrado, W. F. Science 1993, 261, 879-885; Mer, G.; Hietter, H.; Lefevre, J. F. Nat. Struct. Biol. 1996, 3, 45-53; Zhang, X. J.; Baase, W. A.; Schoichet, B. K.; Wilson, K. P.; Matthews, B. W. Protein Eng. 1995, 8, 1017-1022). Therefore, fluorination can be used as a final “push” in protein stabilization after other methods such as directed evolution (Arnold, F. Chem. Eng. Sci. 1996, 51, 5091-5102; Giver, L.; Gershenson, A.; Freskgard, P. O.; Arnold, F. H. Proc. Natl. Acad. Sci. USA 1998, 95, 12809-12813; Zhou, Y. F.; Bowie, J. U. J. Biol. Chem. 2000, 275, 6975-6979) or rational design (DeGrado, W. F.; Summa, C. M.; Pavone, V.; Nastri, F.; Lombardi, A. Annu. Rev. Biochem. 1999, 68, 779-819; Dahiyat, B. L. Science 1997, 278, 82-87) have achieved the initial gain in stability.

1. A polypeptide with increased stability, relative to its corresponding wild type protein, having at least one non-natural amino acid incorporated into a hydrophobic region of the wild type polypeptide, wherein the amino acid so replaced is leucine, isoleucine, or valine.
2. The polypeptide of claim 1, wherein the polypeptide is a protein.
3. The non-natural amino acid of claim 1, wherein the non-natural amino acid is different from its corresponding natural amino acid in side chain functionality.
4. The polypeptide of claim 1, wherein the non-natural amino acid is a hydrophobic amino acid selected from the group consisting of an unsaturated hydrophobic amino acid; a fluorinated hydrophobic amino acid; 2-amino-3-methyl-4-pentenoic acid; 5,5,5-trifluoroleucine; 5,5,5,5′,5′,5′-hexafluoroleucine; 2-amino-3,3,3-trifluoro-methylpentanoic acid; 2-amino-3-methyl-5,5,5-trifluoropentanoic acid; 2-amino-3-methyl-4-pentenoic acid; 4,4,4-trifluorovaline; 4,4,4,4′,4′,4′-hexafluorovaline; homoallylglycine; homopropargylglycine; and p-fluorophenylalanine.
5. A method for increasing stability of a polypeptide comprising introducing at least one non-natural amino acid into the hydrophobic region of the polypeptide thereby producing a polypeptide with increased stability relative to its corresponding wild type polypeptide.
6. The method of claim 5, wherein introducing the non-natural amino acid into the polypeptide involves replacing an existing, naturally occurring amino acid with a non-natural amino acid.
7. The method of claim 5, wherein introducing the non-natural amino acid into the polypeptide involves adding the non-natural amino acid into the polypeptide.
8. The method of claim 5, wherein the natural amino acid is a hydrophobic amino acid and the non-natural amino acid is a hydrophobic amino acid having side chain functionalities different from its corresponding natural amino acid.
9. The method of claim 5, wherein the natural amino acid so replaced is leucine, and the non-natural amino acid is 5,5,5-trifluoroleucine.
10. The method of claim 5, wherein the naturally occurring amino acid so replaced is leucine, and the non-natural amino acid is 5,5,5,5′,5′,5′-hexafluoroleucine.
11. The method of claim 5, wherein the naturally occurring amino acid so replaced is leucine, and the non-natural amino acid is 2-amino-4-methyl-4-pentenoic acid.
12. The method of claim 5, wherein the naturally occurring amino acid so replaced is isoleucine, and the non-natural amino acid is selected from the group consisting of 2-amino-3,3,3-trifluoro-methylpentanoic acid; 2-amino-3-methyl-5,5,5-trifluoropentanoic acid; and 2-amino-3-methyl-4-pentenoic acid.
13. The method of claim 5, wherein the naturally occurring amino acid so replaced is methionine, and the non-natural amino acid is homoallylglycine or homopropargylglycine.
14. The method of claim 5, wherein the natural amino acid is phenylalanine and the non-natural amino acid is p-fluoro-phenylalanine.